Evolutionary method for finding communities in bipartite networks
An important step in unveiling the relation between network structure and dynamics defined
on networks is to detect communities, and numerous methods have been developed separately to
identify community structure in different classes of networks, such as unipartite networks, bipartite
networks, and directed networks. Here, we show that the finding of communities in such networks
can be unified in a general framework: detection of community structure in bipartite networks.
Moreover, we propose an evolutionary method for efficiently identifying communities in bipartite
networks. To this end, we show that both unipartite and directed networks can be represented as
bipartite networks, and that their modularity is completely consistent with that for bipartite
networks, on which the detection of modular structure can be reformulated as modularity maximization.
To optimize the bipartite modularity, we develop a modified adaptive genetic algorithm (MAGA),
which is shown to be especially efficient for community structure detection. The high efficiency of
the MAGA is based on the following three improvements. First, we introduce a different
measure for the informativeness of a locus instead of the standard deviation, which can exactly
determine which loci mutate. This measure is the bias between the distribution of a locus over the
current population and the uniform distribution of the locus, i.e., the Kullback-Leibler divergence
between them. Second, we develop a reassignment technique for differentiating the informative
state a locus has attained from the random state in the initial phase. Third, we present a modified
mutation rule which, by incorporating a related operation, can guarantee the convergence of the MAGA
to the global optimum and can speed up the convergence process. Experimental results show that
the MAGA outperforms existing methods in terms of modularity for both bipartite and unipartite
networks.

PACS numbers: 89.75.Hc, 02.10.Ox, 02.50.-r 



I. INTRODUCTION 

Complex networks have gained overwhelming popularity as a powerful
tool for understanding complex systems from diverse fields, including
the technical, natural, and social sciences. They provide a unified
perspective for studying these systems by modeling them as networks
whose nodes and edges represent, respectively, the units and the
interactions between units [lH6|. According to the types of node,
networks can be classified into unipartite, bipartite, and
multipartite networks. As a typical class of real-world networks,
bipartite networks, in contrast to unipartite ones, consist of two
types of nodes, and edges exist only between nodes of distinct types.
Examples of bipartite networks come from various fields, including
scientific collaboration networks, actor-movie networks, and
protein-protein interaction networks [l|, 0, 0-HJ. Multipartite
networks, with more than three types of nodes, are occasionally
seen [Io|, [H|.

It has been discovered Q that most real networks share a local
clustering feature, i.e., groups of tight-knit nodes that are densely
connected to each other, with sparser edges between groups. These
groups of nodes are generally referred to as communities or modules.
From a topological point of view, a community may correspond to a
functional unit because of its relative structural independence. In
turn, community structure can critically affect diverse dynamics on
networks. Therefore, identification of communities plays a key role in
numerous related areas of complex networks, e.g., predicting protein
function [12] and determining dynamics of systems p^| - [l5j. The last
few years have witnessed tremendous efforts in this direction
[8, 15-26] (useful reviews include Refs. [13, HH). Most previous
studies are dedicated to unipartite networks, while little attention
has been paid to directed networks [23, 24] and bipartite networks
[2^ - |26|.

It is of interest that unipartite and directed networks can be
represented by bipartite networks, as will be shown. Thus, detection
of communities in unipartite or directed networks can be transformed
into the same task in bipartite networks. Given a bipartite
modularity, methods based on modularity maximization [16]-[19] can, in
principle, be applied to bipartite networks. However, they are
expected to be affected by the resolution limit [29, 30] as in the
unipartite case, which may result in the degeneracy problem [31]. This
poses a challenge for methods that return a single solution. Instead,
we present a modified adaptive genetic algorithm to optimize the
bipartite modularity [26]. The evolutionary method can return a better
solution in a shorter time. Moreover, it can return multiple good
solutions in multiple runs, which enables us to evaluate the
reliability (or significance) of solutions without resorting to other
techniques as in [32, 33], as well as to obtain a superior solution by
combining several good solutions when the degeneracy problem is
severe.

In practice, there exist two distinct conceptual understandings of the
community structure of a bipartite network. The first viewpoint
considers each community to be composed of both types of nodes with
dense edges across them, which is similar to the view in the
unipartite case [26]. The alternative view is that any community
should contain only one type of nodes, which are closely connected
through co-participation in several communities that consist of the
other type of nodes [zij. Guided by this view, the usual approach to
identifying communities is to project the bipartite network onto one
specific unipartite network as needed, and then identify communities
in the projection. Guimerà et al. [2~i| recently presented a method
for identifying communities of one type of nodes against the other
type of nodes with a known community structure.

In this paper, we focus on the problem of identifying communities from
the first viewpoint. We present a modified adaptive genetic algorithm
(MAGA), based on the mutation-only genetic algorithm (MOGA), which is
parameter-free, unlike traditional genetic algorithms. The method
does not need to know in advance the number of communities or their
sizes. In Sec. II we first give a short review of Barber's modularity
[26] and then show that unipartite networks and directed networks can
be uniformly represented by bipartite networks. After the description
of the MOGA in Sec. III A, we introduce a different measure for
selecting loci to mutate in Sec. III B and then develop the
reassignment technique in Sec. III C. Further, we discuss how to
select the population size in Sec. III D and address the issues of
convergence and time complexity of the MAGA in Sec. III E. In Sec. IV
we apply the algorithm to model bipartite networks, several real
bipartite networks, and unipartite networks. Finally, the conclusion
is given.



II. BIPARTITE MODULARITY 

The modularity introduced by Newman and Girvan Q aims at quantifying
the goodness of a particular division of a given network, and has been
widely accepted as a benchmark index to measure and compare the
accuracy of various methods of community detection. The definition of
this quantity is based on the idea that community structure means a
statistically surprising arrangement of edges; that is, the number of
actual edges within communities should be significantly larger than
the number expected in a null model. In turn, a null model should have
the same number of nodes and the same degree distribution as the
original network, while the edges of the null model are placed by
chance.

Let $k_i$ be the degree of node $i$, and $M$ the total number of
edges. Since in the null model [18] the probability for an edge being
present between nodes $i$ and $j$ is $k_i k_j / (2M)$, the modularity,
which quantifies the extent to which the number of actual edges
exceeds the expectation based on the null model, can be formulated as
follows:

$$Q = \frac{1}{2M} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( A_{ij} - \frac{k_i k_j}{2M} \right) \delta(g_i, g_j), \tag{1}$$

where $Q$ is the sum of the difference over all groups of the
particular division, $N$ is the network size, $A_{ij}$ indicates the
adjacency relation between nodes $i$ and $j$, $g_i$ represents the
group to which node $i$ is assigned, and the $\delta$ function takes
the value 1 if $g_i$ equals $g_j$ and 0 otherwise.

The value of $Q$ ranges from -1 to 1. Given a network, a larger value
generally indicates a more accurate division of the network into
communities. Community structure detection can thus be formulated as a
problem of modularity maximization, which often works well although it
may suffer from a resolution problem [29, 30]. On the other hand, due
to the resolution limit and the random fluctuation effect [34], it
appears preferable for the divisions delivered by modularity
maximization approaches to be accompanied by an evaluation of their
reliability [3ll - MM.

The above modularity is designed for unipartite networks. To suit
other classes of networks, several variations of modularity based on
different null models have been proposed, including weighted [36],
directed [2a |, and bipartite modularity [U [26j]. A bipartite network
with $N$ nodes can be conveniently denoted by a pair $(p, q)$ with
$p + q = N$, where $p$ and $q$ are the numbers of the two types of
nodes. We can renumber the nodes such that in the sequence
$1, 2, \ldots, p, p+1, \ldots, N$ the leftmost $p$ indices represent
the first type of nodes and the remainder represent the second type.
Then Barber's bipartite modularity [26], which considers a community
to be composed of both types of nodes in the network, can be written
as

$$Q_B = \frac{1}{M} \sum_{i=1}^{p} \sum_{j=p+1}^{N} \left( A_{ij} - \frac{k_i k_j}{M} \right) \delta(g_i, g_j). \tag{2}$$
FIG. 1: Transformation of a simple unipartite network into a
bipartite one. (a) A unipartite network with five nodes and six
edges. (b) The bipartite network corresponding to (a).

A subtle difference between the two modularities in Eqs. (1) and (2)
can be observed immediately. It is of interest that a unipartite
network can be equivalently represented as a bipartite one, and that
the bipartite modularity then recovers the modularity of the original
network. If each node $i$ is represented by two nodes $A_i$ and $B_i$,
and each edge $i$-$j$ by two edges $A_i$-$B_j$ and $A_j$-$B_i$, then a
unipartite network with $N$ nodes and $M$ edges is transformed into a
corresponding bipartite network with $2N$ nodes and $2M$ edges. For
example, the transformation of a simple unipartite network is shown in
Fig. 1. Further, if we label the $N$ nodes $A_i$ with
$1, 2, \ldots, N$ and the nodes $B_i$ with $N+1, N+2, \ldots, 2N$,
then an edge $i$-$j$ in the original network corresponds to two edges,
$i$-$(N+j)$ and $j$-$(N+i)$. Using the bipartite modularity introduced
in Eq. (2) on the induced bipartite network, we have

$$\begin{aligned}
Q_B &= \frac{1}{2M} \sum_{i=1}^{N} \sum_{j=N+1}^{2N} \left( A_{ij} - \frac{k_i k_j}{2M} \right) \delta(g_i, g_j) \\
    &= \frac{1}{2M} \sum_{i=1}^{N} \sum_{j'=1}^{N} \left( A_{i,N+j'} - \frac{k_i k_{N+j'}}{2M} \right) \delta(g_i, g_{N+j'}) = Q,
\end{aligned} \tag{3}$$

where we have made use of the fact that the nodes $A_i$ and $B_i$
should be in an identical community and have the same degree, so that
the last sum coincides with Eq. (1). Thus, bipartite modularity can
also be used for community detection in unipartite networks after they
are transformed.
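The equivalence between Eq. (1) on a unipartite network and Eq. (2) on its bipartite transformation can be checked numerically. The following is a minimal sketch under stated assumptions: the function names, the 0-based node labels, and the small five-node test graph are illustrative choices and are not taken from the paper.

```python
def modularity(edges, groups):
    """Eq. (1): modularity of an undirected simple graph (0-based node labels)."""
    n, m = len(groups), len(edges)
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    deg = [sum(row) for row in adj]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if groups[i] == groups[j]:
                total += adj[i][j] - deg[i] * deg[j] / (2 * m)
    return total / (2 * m)


def barber_modularity(edges, p, groups):
    """Eq. (2): nodes 0..p-1 are of the first type, p..N-1 of the second."""
    n, m = len(groups), len(edges)
    edge_set = set(edges)  # each edge stored as (first-type node, second-type node)
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    total = 0.0
    for a in range(p):
        for b in range(p, n):
            if groups[a] == groups[b]:
                total += ((a, b) in edge_set) - deg[a] * deg[b] / m
    return total / m


def to_bipartite(edges, n):
    """Node i becomes A_i (label i) and B_i (label n + i);
    edge i-j becomes the two edges A_i-B_j and A_j-B_i."""
    return [(i, n + j) for i, j in edges] + [(j, n + i) for i, j in edges]


# Hypothetical unipartite graph with five nodes and six edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (2, 4)]
groups = [0, 0, 0, 1, 1]

q_uni = modularity(edges, groups)
# A_i and B_i are placed in the same community, hence groups + groups.
q_bip = barber_modularity(to_bipartite(edges, 5), 5, groups + groups)
print(q_uni, q_bip)
```

The two printed values agree up to floating-point rounding, which illustrates why a single bipartite objective is enough for both classes of networks.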

We then turn to the modularity for directed unipartite networks,
another important class of networks. A directed network can
analogously be transformed into a bipartite network. A node $i$ is
represented by two nodes $A_i$ and $B_i$ as in the unipartite case,
while a directed edge from $i$ to $j$ is represented as an edge
between $A_i$ and $B_j$; that is, the sets $\{A_i\}$ and $\{B_i\}$ are
the sources and the sinks, respectively. Again, using Eq. (2) and the
fact above, we obtain

$$\begin{aligned}
Q_B &= \frac{1}{M} \sum_{i=1}^{N} \sum_{j=N+1}^{2N} \left( A_{ij} - \frac{k_i k_j}{M} \right) \delta(g_i, g_j) \\
    &= \frac{1}{M} \sum_{i=1}^{N} \sum_{j'=1}^{N} \left( A_{i,N+j'} - \frac{k_i k_{N+j'}}{M} \right) \delta(g_i, g_{N+j'}) \\
    &= \frac{1}{M} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( A_{ij} - \frac{k_i^{\mathrm{out}} k_j^{\mathrm{in}}}{M} \right) \delta(g_i, g_j),
\end{aligned} \tag{4}$$

where $A_{ij}$, $k_i^{\mathrm{out}}$, and $k_j^{\mathrm{in}}$ in the
last line refer to the original directed network, and the right-hand
side of the last equality is just the modularity for directed networks
presented in [23]. The method for transforming directed networks into
bipartite ones has been proposed by Guimerà et al. [lH, but their
bipartite modularity is distinct from Barber's, as mentioned before.
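The directed case can be checked in the same way. The sketch below assumes 0-based labels, a hypothetical six-arc test network, and illustrative function names. It compares the directed modularity (with $A_{ij} = 1$ for an arc from $i$ to $j$) against Barber's modularity on the source-sink bipartite representation.

```python
def directed_modularity(arcs, groups):
    """(1/M) * sum_ij (A_ij - k_i^out * k_j^in / M) * delta(g_i, g_j),
    where A_ij = 1 if there is an arc from i to j."""
    n, m = len(groups), len(arcs)
    arc_set = set(arcs)
    k_out, k_in = [0] * n, [0] * n
    for i, j in arcs:
        k_out[i] += 1
        k_in[j] += 1
    total = 0.0
    for i in range(n):
        for j in range(n):
            if groups[i] == groups[j]:
                total += ((i, j) in arc_set) - k_out[i] * k_in[j] / m
    return total / m


def barber_modularity(edges, p, groups):
    """Eq. (2): nodes 0..p-1 are of the first type, p..N-1 of the second."""
    n, m = len(groups), len(edges)
    edge_set = set(edges)
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    total = 0.0
    for a in range(p):
        for b in range(p, n):
            if groups[a] == groups[b]:
                total += ((a, b) in edge_set) - deg[a] * deg[b] / m
    return total / m


# Hypothetical directed network with five nodes and six arcs.
arcs = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
groups = [0, 0, 0, 1, 1]
n = len(groups)

# Arc i -> j becomes the bipartite edge A_i - B_j, with B_j labelled n + j.
bip_edges = [(i, n + j) for i, j in arcs]

q_dir = directed_modularity(arcs, groups)
q_bip = barber_modularity(bip_edges, n, groups + groups)
print(q_dir, q_bip)
```

In this representation the degree of $A_i$ is the out-degree of $i$ and the degree of $B_j$ is the in-degree of $j$, which is why the two values coincide.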

Consequently, bipartite networks can be considered a wider class of
networks that provides a generic case for the problem of community
structure detection, and Barber's bipartite modularity can serve as a
uniform objective for optimization-based methods of identifying
communities.

III. EVOLUTIONARY METHOD FOR 
COMMUNITY DETECTION 

As a class of general-purpose tools for solving hard problems, genetic
algorithms have found wide application in bioinformatics, computer
science, physics, engineering, and other fields. Based on the
Darwinian principle of survival of the fittest, they are a kind of
global optimization method that simulates the evolutionary processes
of species in nature [37].

Evolutionary methods are easy to implement, and the process can be
described as follows. The methods start with a stochastically created
initial population of predefined size, in which the individuals, known
as chromosomes, represent a set of feasible solutions to the problem
at hand, each associated with a fitness value. Then chromosomes are
selected in proportion to their fitness, so that fitter individuals
will have multiple copies and less fit ones will be discarded in the
new population. Next, genetic operators such as crossover and mutation
are applied to the population according to their respective specified
rates. After these operations, the population of the next generation
has been produced. The above process is iterated to evolve the current
population toward better offspring until the termination criterion is
met.

The number of divisions of any given network grows at least
exponentially in the network size, and the optimization of modularity
is an NP-hard problem, as rigorously proven in [38]. This has
motivated an array of heuristic methods, including greedy
agglomeration Q, simulated annealing (SA) [la ], spectral relaxation
(SR) [13, Gil], extremal optimization (EO) [19], and mathematical
programming [39]. All these methods perform a point-to-point search,
that is, a transformation from one solution to a better one, and are
susceptible to being trapped in a local optimum. In contrast, genetic
algorithms work with a population of solutions instead of a single
solution. This implies that genetic algorithms are more robust,
because they perform concurrent searches in multiple directions, which
helps them find better solutions effectively.

However, for practitioners, a fundamental problem is to choose
appropriate parameters such as the crossover rate and the mutation
rate, because they seriously affect the performance of genetic
algorithms. Furthermore, these parameters are closely related to the
problem studied, and even for the same problem, they should adjust
themselves in the course of the search. In the following, we introduce
an adaptive genetic algorithm recently presented by Szeto and Zhang
[40] and then propose a modified version suited for community
structure detection.



A. Mutation-only genetic algorithm

Traditional genetic algorithms assume that genetic operators act
indiscriminately on each locus of the chromosome, but this is not
always the case. Indeed, recent research on human DNA [41] shows that
mutation rates at different loci differ greatly from one another.
Inspired by this, Ma and Szeto [42] reported a locus-oriented adaptive
genetic algorithm (LOAGA) that makes use of the statistical
information inside the population to tune the mutation rate at each
individual locus. Szeto and Zhang [40] further presented a new
adaptive genetic algorithm, called the mutation-only genetic algorithm
(MOGA), which generalized the LOAGA by incorporating the information
about the loci statistics in the mutation operator. In the MOGA,
mutation is the only genetic operator, and the only required parameter
is the population size. The MOGA was readdressed by Law and Szeto in
[43], wherein it was extended to include a crossover operator. Here,
the description of the MOGA is given on the basis of the later
version.

The population matrix $P$ has $N_P$ stacked chromosomes of length $L$,
with its entries $P_{ij}(t)$ representing the allele at locus $j$ of
chromosome $i$ at time (or generation) $t$. The rows of this matrix
are ranked according to the fitness of the chromosomes in descending
order, i.e., $f(i) \ge f(k)$ for $i < k$. The columns are ranked
according to the standard deviation $\sigma_t(j)$ (defined below) of
the alleles at locus $j$, such that $\sigma_t(j) \ge \sigma_t(k)$ for
$j < k$. In the MOGA, the fitness cumulative probability, an
informative measure for chromosome $i$ relative to the fitness
landscape of the whole population, was introduced and defined as

$$C(i) = \frac{1}{N_P} \sum_{g \le f(i)} N(g), \tag{5}$$

where $N(g)$ is the number of chromosomes whose fitness values equal
$g$. Subsequently, the standard deviation $\sigma_t(j)$ over the
allele distribution, a useful informative measure for each locus $j$,
is defined as

$$\sigma_t(j) = \sqrt{\frac{\sum_{i=1}^{N_P} C(i) \left( P_{ij}(t) - h_j(t) \right)^2}{\sum_{i=1}^{N_P} C(i)}}, \tag{6}$$

where the weighting factor $C(i)$ reflects the informative usefulness
of chromosome $i$, and $h_j(t)$ is the mean of the alleles at locus
$j$, given by

$$h_j(t) = \frac{1}{N_P} \sum_{i=1}^{N_P} P_{ij}(t). \tag{7}$$
A locus with a smaller allele standard deviation was considered to be
more informative than other loci, and vice versa. This makes sense in
limited situations. For the initial population, the alleles at each
locus $j$ should follow a uniform distribution, so the standard
deviation $\sigma_t(j)$ will be very high while the locus is not yet
informative. A typical optimization problem generally allows for only
a few global optima, so the loci with higher structural information
are liable to take fewer alleles than allowed, thereby having smaller
allele standard deviations. Therefore, the loci with higher deviations
are preferred for mutation, while the other loci (informative loci)
remain to guide the evolution process.
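The locus statistics of the MOGA can be sketched in a few lines. The code below assumes that the cumulative probability counts chromosomes with fitness not exceeding $f(i)$, that the mean $h_j$ is unweighted, and that the deviation is weighted by $C(i)$; the function names are illustrative. Applied to the three-chromosome population listed in Table I, it reproduces the printed $\sigma$ values for loci 2 to 4 and gives approximately 36.74 for locus 1.

```python
import math


def cumulative_fitness(fitness):
    """C(i): fraction of chromosomes whose fitness does not exceed f(i)."""
    n = len(fitness)
    return [sum(1 for g in fitness if g <= f) / n for f in fitness]


def allele_std(column, weights):
    """Weighted allele standard deviation of one locus."""
    mean = sum(column) / len(column)  # unweighted mean h_j
    num = sum(c * (x - mean) ** 2 for c, x in zip(weights, column))
    return math.sqrt(num / sum(weights))


# Population of Table I: rows are chromosomes R1..R3, columns are loci 1..4.
fitness = [0.5, 0.3, 0.2]
population = [
    [100, 20, 4, 8],
    [100, 50, 5, 12],
    [10, 50, 6, 7],
]

C = cumulative_fitness(fitness)
sigma = [allele_std([row[j] for row in population], C) for j in range(4)]
print(C)
print([round(s, 4) for s in sigma])
```

The resulting ordering, locus 1 first and locus 3 last, is the ordering the MOGA would use to rank the columns of the population matrix.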

Now we can describe the process for the MOGA. In each generation, we
sweep the population matrix from top to bottom. Each row (a
chromosome) is selected for mutation with probability
$a(i) = 1 - C(i)$. According to Eq. (5), we have
$1/N_P \le C(i) \le 1$. Thus, a chromosome with a higher fitness value
has fewer chances to be selected, and vice versa. In particular, the
first chromosome, which has the highest fitness value, will never be
selected for mutation, while the last one will almost always undergo
mutation for a large enough $N_P$. If $N_P$ takes a value from 50 to
100, as De Jong suggested [44], then, for example,
$a(N_P) = 1 - 1/50 = 0.98$ for $N_P = 50$. If the currently selected
chromosome is $i$, then the number $N(i)$ of loci for mutation is
prescribed as $N(i) = a(i) \times L$. Thus, a selected chromosome with
a higher fitness value has fewer loci to mutate, so that most of the
informative loci remain, while a selected chromosome with a lower
fitness value has more less-informative loci to mutate. In practice,
we can mutate the $N(i)$ leftmost loci because they are less
informative than the others according to the above arrangement of
loci.
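The selection step of one MOGA sweep can be sketched as follows. This is an illustration under assumptions: the text does not say how $a(i) \times L$ is rounded, so the sketch truncates to an integer, and the function names, the seeded random generator, and the 50 distinct fitness values are hypothetical.

```python
import random


def mutation_schedule(fitness, length):
    """For each chromosome (fitness sorted in descending order), return
    (a(i), N(i)): the selection probability and the number of loci to mutate."""
    n = len(fitness)
    schedule = []
    for f in fitness:
        c = sum(1 for g in fitness if g <= f) / n  # C(i)
        a = 1 - c
        schedule.append((a, int(a * length)))  # truncation is an assumption
    return schedule


def select_for_mutation(fitness, length, rng):
    """One sweep: pick rows with probability a(i) and return, per picked row,
    the indices of the leftmost N(i) loci (the least informative columns)."""
    picks = {}
    for i, (a, n_loci) in enumerate(mutation_schedule(fitness, length)):
        if rng.random() < a:
            picks[i] = list(range(n_loci))
    return picks


# Hypothetical population of 50 chromosomes with distinct fitness values.
fitness = [0.02 * k for k in range(50, 0, -1)]
sched = mutation_schedule(fitness, 20)
picks = select_for_mutation(fitness, 20, random.Random(0))
print(sched[0], sched[-1])
```

The best chromosome has $a = 0$ and is never picked, while the worst one has $a = 0.98$, matching the example for $N_P = 50$ in the text.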

Overall, the MOGA was expected to have a two-fold advantage over
traditional genetic algorithms. First, because no parameters other
than the population size need to be supplied, it can be applied more
readily to various problems. Second, the mechanism of adaptively
adjusting parameters can make it perform more effectively and obtain
better solutions, provided it works as expected.

FIG. 2: Encoding schema of a chromosome for a bipartite network
$(p, q)$. $B_{j_k}$ (for $k \le p$) is the allele at the locus
representing node $A_k$, which stands for a neighbor node of $A_k$ in
the network. Similarly, $A_{i_k}$ (for $k \le q$) is the allele at the
locus representing node $B_k$.



TABLE I: Example of a population with three chromosomes. Fitness is
calculated on the division induced from decoding the chromosome.
Values in each column are the alleles at the locus.

Chro.   Fitness   Loc.1     Loc.2     Loc.3    Loc.4
R1      0.5       100       20        4        8
R2      0.3       100       50        5        12
R3      0.2       10        50        6        7
σ                 36.7243   15.8114   0.8165   2.0412



B. Measure for the informativeness of loci 

Despite these possible advantages, the MOGA cannot be directly applied
to community structure detection because of a drawback that will be
shown below. Instead, we present a modified version of the MOGA, i.e.,
the MAGA, which is especially suited for the problem of community
structure detection. We note that genetic algorithms have been applied
to this problem in [45], [4(|, but these applications are based on
standard genetic algorithms (SGAs).

We begin with the encoding schema of the genetic algorithm for finding
communities in a bipartite network. A useful representation is the
locus-based adjacency representation presented by Park and Song in
[47], where it was used for clustering data; it has also been used for
community detection [46]. In this encoding schema, a chromosome
consists of $N$ loci, one locus for each node in the network, and the
allele at locus $j$ is the label of one neighbor of node $j$ in the
network. In this way, a chromosome induces a graph that is often
disconnected, because it has fewer edges than the original network.
Given that a community is connected, decoding the division from a
chromosome then amounts to finding all the connected components of the
induced graph. For simplicity, we also call them the connected
components of the chromosome.
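Decoding a chromosome in the locus-based adjacency representation can be sketched with a union-find structure. The sketch below assumes 0-based node labels and a hypothetical six-node chromosome; the function name `decode` is illustrative.

```python
def decode(chromosome):
    """chromosome[j] is the neighbor chosen by node j.
    Returns one community label per node, i.e., the connected components
    of the graph induced by the edges j - chromosome[j]."""
    parent = list(range(len(chromosome)))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for node, neighbor in enumerate(chromosome):
        ra, rb = find(node), find(neighbor)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Relabel the roots as consecutive community indices 0, 1, 2, ...
    relabel = {}
    return [relabel.setdefault(find(x), len(relabel)) for x in range(len(chromosome))]


# Hypothetical chromosome: nodes 0-2 point within one group, nodes 3-5 within another.
print(decode([1, 2, 0, 4, 3, 4]))  # [0, 0, 0, 1, 1, 1]
```

Because every node contributes exactly one edge, decoding takes near-linear time in the number of nodes, which keeps the fitness evaluation dominated by the modularity computation itself.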

Now, we apply the encoding schema to the case of bipartite networks.
Given a bipartite network $(p, q)$, we label its nodes as noted above,
i.e., we label nodes of type A with $1, 2, \ldots, p$ and nodes of the
other type with $p+1, \ldots, N$. Then a chromosome $R$ for the
network can be represented as shown in Fig. 2. Since our objective is
to find a division with as high a modularity as possible, the fitness
function can be defined directly in terms of the modularity. Based on
the above representation of the chromosome, this function becomes

$$Q_B(\pi_R) = \frac{1}{M} \sum_{i=1}^{p} \sum_{j=p+1}^{N} \left( A_{ij} - \frac{k_i k_j}{M} \right) \delta(g_i, g_j), \tag{8}$$

where the parameter $\pi_R$ emphasizes that the division on which the
modularity is calculated is encoded by chromosome $R$.

Recall that in the MOGA the allele standard deviation is used to pick
the loci to mutate. When applied to community structure detection,
however, this measure will generally misguide the algorithm. Consider
the simple case in which the population consists of only three
chromosomes, $R_1$, $R_2$, and $R_3$, which in turn consist of four
loci, each of which can take three alleles. Table I shows the allele
distribution at these loci.

From Eqs. (5) and (6), the allele standard deviations for the four
loci, $\sigma_1$, $\sigma_2$, $\sigma_3$, and $\sigma_4$ (henceforth,
we omit the parameter $t$ for simplicity), can be calculated to obtain

$$\sigma_1 > \sigma_2 > \sigma_4 > \sigma_3. \tag{9}$$

According to the selection criterion for the loci to mutate in the
MOGA, locus 1 has the highest standard deviation and will be picked
out.

In fact, the informativeness of a locus implies a certain bias, and
vice versa. The initial population is generated randomly, and each
locus follows an approximately random (i.e., uniform) distribution.
From the uniform distribution, we learn nothing about the structure of
the optimal solution to the given problem. With gradual evolution,
more and more fit members of the population will assume the same
alleles at some loci, which may suggest some structural information
about the optimal solutions; that is, the bias (or deviation) from the
random distribution indicates the informativeness of the locus. In the
simplest case, such as the knapsack problem where each locus takes the
value 1 or 0, the allele standard deviation amounts to the bias and
the MOGA can work well [ifjj.

For the current case, loci 3 and 4 should be selected with equally
high priority because their allele distributions are equally close to
their respective random distributions. Both loci 1 and 2 show a
certain bias in their alleles, indicating that they are more
informative than the others. If the informativeness of each chromosome
is taken into account, however, they evidently differ from one
another. Locus 1 has a larger bias since the chromosomes with the same
allele 100, i.e., $R_1$ and $R_2$, have higher fitness. In contrast,
locus 2 has a smaller bias since the chromosomes with the same allele
50, i.e., $R_2$ and $R_3$, have lower fitness. Therefore, the correct
order of mutation is

$$\text{locus 3} = \text{locus 4} > \text{locus 2} > \text{locus 1}, \tag{10}$$

where the equality means that the pair of loci have the same priority
for mutation. Obviously, the allele standard deviation would severely
misguide the MOGA in the current case.

The failure of the allele standard deviation stems from the fact that
this measure is closely tied to the particular values of the alleles
at the loci. However, the information contained in a locus does not
depend on these particular values but is solely determined by the bias
relative to the random distribution. The method of measuring the bias
is thus crucial. Fortunately, we can use the Kullback-Leibler
divergence [48] to describe the bias.

In the formalism of the MAGA, we explicitly represent a locus $j$ as a
discrete random variable $X_j$, and an allele at the locus is a value
that $X_j$ can take. Note that in the following the set of all alleles
at the locus is denoted by $X_j$ as well. Then the random distribution
at the locus can be formally given by

$$\mathcal{Q}(X_j = x) = \begin{cases} 1/|X_j| & \text{for each } x \in X_j, \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

Let the allele distribution over the population be $\mathcal{P}$,
defined by

$$\mathcal{P}(X_j = x) = \frac{\sum_{i:\, P_{ij} = x} f(i)}{\sum_{i} f(i)}. \tag{12}$$

We can mathematically define the bias $\mu$ as the Kullback-Leibler
divergence between the two distributions, $\mathcal{P}$ and
$\mathcal{Q}$:

$$\mu(X_j) = \sum_{x \in X_j} \mathcal{P}(X_j = x) \log \frac{\mathcal{P}(X_j = x)}{\mathcal{Q}(X_j = x)}. \tag{13}$$

The choice of base for the logarithm does not affect the ordering of
the loci, although it rescales the value of the bias; in the following
all logarithms are taken to base 2. The quantity $0 \log 0$ should be
interpreted as zero. As a Kullback-Leibler divergence, the bias is
always non-negative and is zero if and only if
$\mathcal{P} = \mathcal{Q}$. The intuitive explanation is that the
amount of information a locus contains is always non-negative, and
that we have to roll an unbiased die if we have no knowledge about
something. Conversely, we can predict that an event will inevitably
occur only when we have complete information about it.

Reconsidering the above example, we obtain $\mu_1 = 0.863$,
$\mu_2 = 0.585$, and $\mu_3 = \mu_4 = 0.5145$. As a smaller bias
indicates that a locus contains poorer information, such a locus
should undergo mutation. Conversely, a larger bias means richer
informativeness, and the locus should remain. Therefore, guided by the
bias, the order of mutation is locus 3, 4, 2, 1 or 4, 3, 2, 1, which
completely matches the order in Eq. (10).

Furthermore, it can be observed that locus 2 has zero bias if it has
only two alleles. The difference arising from a change in the number
of alleles would normally be concealed by the allele standard
deviation. For these reasons, the bias appears superior to the allele
standard deviation. A better alternative is to use the normalized
bias, as in our MAGA, which ranges from 0 to 1 after being divided by
$\log |X_j|$.
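The bias can be sketched as a fitness-weighted Kullback-Leibler divergence from the uniform distribution. The code below assumes that each chromosome contributes its fitness, normalized by the total fitness, to the allele it carries, and that each locus of Table I has three possible alleles; the function name and the `normalize` flag are illustrative. With these assumptions it reproduces the quoted values $\mu_1 = 0.863$ and $\mu_2 = 0.585$, and it assigns loci 3 and 4 equal biases that are smaller than those of loci 1 and 2.

```python
import math


def bias(alleles, fitness, n_alleles, normalize=False):
    """Kullback-Leibler divergence (base 2) between the fitness-weighted
    allele distribution of one locus and the uniform distribution."""
    total = sum(fitness)
    dist = {}
    for x, f in zip(alleles, fitness):
        dist[x] = dist.get(x, 0.0) + f / total
    q = 1.0 / n_alleles
    # Alleles absent from the population contribute 0 * log 0 = 0.
    mu = sum(p * math.log2(p / q) for p in dist.values() if p > 0)
    return mu / math.log2(n_alleles) if normalize else mu


# Population of Table I: one list of alleles per locus.
fitness = [0.5, 0.3, 0.2]
loci = [[100, 100, 10], [20, 50, 50], [4, 5, 6], [8, 12, 7]]

mu = [bias(column, fitness, n_alleles=3) for column in loci]
print([round(m, 3) for m in mu[:2]])  # [0.863, 0.585]

# With only two possible alleles, locus 2 is exactly uniform and has zero bias.
print(bias(loci[1], fitness, n_alleles=2))
```

Passing `normalize=True` divides the result by $\log_2 |X_j|$, so that loci with different numbers of possible alleles can be compared on the same scale from 0 to 1.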



C. The reassignment technique for the locus 
statistic 

So far it has been assumed that the loci with random distributions
should have the highest priority for mutation. However, in the
community detection case this presupposition does not always hold.
After a certain number of generations, some communities or their main
bodies will appear at the population scale. At that stage, a locus
with a random distribution does not necessarily contain no
information, and it should not necessarily undergo mutation
immediately. Generally, the network contains many nodes whose
neighbors are all (or almost all) in the same community and have a
similar connection pattern, or are even structurally equivalent nodes
[49] that are connected to the same nodes. For such a node, if
chromosomes in which all (or most) of its neighbors lie in the same
connected component predominate in the current population, then the
locus has a random or approximately random distribution. Therefore, we
need to differentiate between these cases to avoid being misguided.

The reassignment technique is designed to deal with this problem. For
a chromosome $R$, let $x$ be the allele at locus $j$, which is a
neighbor of node $j$. Check whether the component in which $j$ lies
includes other neighbors of $j$ with smaller labels in the original
network. If so, and the neighbor with the smallest label is $y$, then
the contribution from $R$, $f(R) / \sum_i f(i)$, that would have been
assigned to $x$ is reassigned to $y$ if $x \ne y$. In this way, a
forward sweep of the population matrix yields an updated allele
distribution at the locus over the population, given by

$$\mathcal{P}^*(X_j = x) = \frac{\sum_{i:\, S(i,j) = x} f(i)}{\sum_{i} f(i)}, \tag{14}$$

where $S(i,j)$ is the neighbor of node $j$ with the smallest label
that lies with $j$ in the same component of chromosome $i$.
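The reassigned distribution can be sketched as follows, using the four chromosomes of Table II as data. The representation of each chromosome by the set of nodes in the component of node 1, and the function name, are assumptions made for illustration.

```python
import math


def reassigned_distribution(fitness, components, neighbors):
    """Each chromosome contributes its normalized fitness to S(i, j): the
    smallest-labelled neighbor of the node that shares its component.
    components[i] is the set of nodes in that component for chromosome i."""
    total = sum(fitness)
    dist = {x: 0.0 for x in neighbors}
    for f, comp in zip(fitness, components):
        # The allele itself always lies in the component, so min() is defined.
        target = min(x for x in neighbors if x in comp)
        dist[target] += f / total
    return dist


# Table II: node 1 has neighbors 2..5; the rows are chromosomes R1..R4.
fitness = [0.28, 0.25, 0.25, 0.22]
neighbors = [2, 3, 4, 5]
components = [{1, 2, 3}, {1, 3, 4}, {1, 2, 4, 5}, {1, 3, 5}]

dist = reassigned_distribution(fitness, components, neighbors)

# Bias of the reassigned distribution relative to the uniform one (base 2).
q = 1.0 / len(neighbors)
mu = sum(p * math.log2(p / q) for p in dist.values() if p > 0)
print({x: round(p, 2) for x, p in dist.items()}, round(mu, 4))
```

Without reassignment the four alleles 2, 3, 4, and 5 would receive weights 0.28, 0.25, 0.25, and 0.22, which is nearly uniform; after reassignment the weight concentrates on alleles 2 and 3.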

TABLE II: Example of reassignment technique. Column 1 lists four
chromosomes, column 2 is the fitness of the chromosomes, column 3
shows the alleles of locus 1, and the right four columns show whether
the corresponding nodes are in the same connected component as node 1,
with 1 indicating yes and 0 no.

Chro.   Fitness   Loc.1   Loc.2   Loc.3   Loc.4   Loc.5
R1      0.28      2       1       1       0       0
R2      0.25      3       0       1       1       0
R3      0.25      4       1       0       1       1
R4      0.22      5       0       1       0       1



An example using the technique as shown in Table [II] 
Using Eq. (fl~2"j). it is obvious that the locus 1 has an 
approximately random distribution and thus the bias is 
close to 0. Recalculating the distribution with the re- 
assignment technique, however, we have V*{X\ = 2) = 
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(b) 

FIG. 3: Two possible schemes for changing the allele at locus 
j, where nodes represent loci and the directed edge j — > i rep- 
resents that the present allele at locus j is i, while undirected 
edges are irrelevant to the reassignment process. The black 
node is the node (locus) j, the nodes with dashed border are 
the allele nodes in this component, the gray ones are the in- 
fluenced nodes and the others are indifferent ones, (a) The 
new target node 1 (new allele) is in the subgraph elicited from 
the node 3 (the present allele at locus j). (b) The new target 
node 1 is not in the subgraph elicited from the node 3. 



ing from the node j. Since the subgraph elicited from
node j is connected to the rest of the component through
j, this traversal must end at a node that has already been
visited. Let the path be j → x_1 → x_2 → ... → x_{k-1} → x_k.
When x_k ≠ j, we can reestablish the connectivity by
removing the last edge, reversing the direction of each
edge in the path, and adding a new edge x_1(3) → j.
Note that the resultant graph satisfies the constraint
that every node has only one outgoing edge. Therefore,
we can reset the alleles at those loci involved in the
path. For example, in Fig. 3(b), the entire path is
j → 3 → 2 → 6 → 5 → 7 → 2, so we can set the alleles
according to the path 7 → 5 → 6 → 2 → 3 → j.
Now, the allele at locus j can be set to 1. As for the
case x_k = j, we can directly alter the allele as in the
scheme in Fig. 3(a).

In the reassignment technique, we can also reassign the 
contribution from the chromosome to the allele with the 
maximum label that lies in the same component when 
performing locus statistics. More generally, the method 
can also work as long as we arbitrarily specify a fixed re- 
assignment order for each locus, although different pre- 
scriptions may produce different biases. 

Clearly, the reassignment technique is very useful for
community structure detection, although it does not
work when applied to loci that have a single allele, i.e.,
when the corresponding nodes in the network are leaves.
This special case can be readily eliminated by forbidding
mutation at such loci, which brings the additional
merit of naturally reducing the complexity of
the problem. Since most real-world networks are scale-free
and contain a substantial number of leaf nodes, this
merit will be very significant for finding communities in
such networks.



0.53, P*(X_1 = 3) = 0.47, and P*(X_1 = 4) = P*(X_1 =
5) = 0, which is very different from the random distribution,
with bias 1.0026.
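
The numbers in this example can be checked with a short Python sketch. It assumes that the distribution of a locus weights each chromosome by its normalized fitness and that the bias is the Kullback-Leibler divergence, in bits, from the uniform distribution over the four alleles of locus 1; both assumptions reproduce the values quoted here. The function and variable names are hypothetical.

```python
import math

# Table II: allele at locus 1, fitness, and the alleles of locus 1 that lie
# in the same connected component as node 1 in that chromosome.
TABLE_II = [
    (2, 0.28, {2, 3}),
    (3, 0.25, {3, 4}),
    (4, 0.25, {2, 4, 5}),
    (5, 0.22, {3, 5}),
]
ALLELES = [2, 3, 4, 5]


def locus_distribution(rows, reassign=False):
    """Fitness-weighted allele distribution of the locus.

    With reassign=True, each chromosome contributes to the allele with the
    minimum label in the same component instead of its actual allele.
    """
    total = sum(fitness for _, fitness, _ in rows)
    dist = {}
    for allele, fitness, same_component in rows:
        key = min(same_component) if reassign else allele
        dist[key] = dist.get(key, 0.0) + fitness / total
    return dist


def bias(dist, alleles):
    """KL divergence (in bits) between dist and the uniform distribution."""
    uniform = 1.0 / len(alleles)
    return sum(
        p * math.log2(p / uniform)
        for p in (dist.get(a, 0.0) for a in alleles)
        if p > 0
    )


plain = locus_distribution(TABLE_II)
starred = locus_distribution(TABLE_II, reassign=True)
print(bias(plain, ALLELES))    # close to 0 (about 0.005)
print(bias(starred, ALLELES))  # about 1.0026
```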

The idea behind the technique is easy to understand.
Given a locus j, we can replace the present allele with any
other allele that lies in the same component in a way that
does not alter the connectivity of the component, hence
causing no change in the division encoded by the chromosome.
To show its feasibility, we focus on the component
in which j lies. Recall that a locus represents a node
and the allele at the locus represents the unique neighbor
the node adheres to. Consequently, the component is
in the form of a directed graph with out-degree one
for each node. There exist two possible schemes, as shown
in Fig. 3. Note that the undirected edges are irrelevant
to the reassignment process; thus their directions can be
disregarded.

In the scheme depicted in Fig. 3(a), we can directly
change the allele from 3 to 1 but still maintain the
connectivity of the component. For the scheme in Fig. 3(b),
however, such direct altering of the allele will split the
original component. To deal with this case, we study
the traversal in the component along directed edges, start-



D. Population size 

As in the MOGA, the only parameter that needs to be
provided in the MAGA is the population size. This
parameter may have a significant influence on the application
of genetic algorithms. De Jong's experiment on a small
suite of test functions showed [44] that the best population
size was 50-100 for these functions. There are also
other empirical studies and theoretical analyses of this 
parameter [5(| [H|. In practice, De Jong's setting has 
been widely adopted, which may be because this choice 
gives a good tradeoff between the quality of the solution 
and the cost of computation in many cases. 

This popularity of the setting, however, does not ex- 
clude the development of genetic algorithms working with 
a variable population size. A few examples of the class 
of algorithms can be found in 

[EMU. 

Although one of these mechanisms may be beneficial if incorporated
into the MAGA, in this work we do not take it into account.

Since we expect that all alleles at a locus can simultaneously
appear in the population, a population size that
is greater than the degrees of most nodes in the network
would be preferable. As mentioned before, most
real-world networks are scale-free, so the degrees of most
nodes in these networks are smaller than 50. Considering
this fact and the cost of a large population size, we
take a fixed value from the interval between 50
and 200.



E. Convergence and its speeding up 

The MOGA was reported to perform well when applied
to the knapsack problem [40], where all
the loci have two alleles, 0 and 1. In many cases, however,
its performance will be hindered by two factors.
One factor is the misguidance by the allele standard
deviation mentioned above. The other is that in the evolution
of each generation the fittest individual(s) actually will
not participate in the mutation unless others supersede
it (them).

In fact, despite fulfilling the elite preservation strat- 
egy HE HH that assures convergence for a SGA toward 
the global optimum, the MOGA does not guarantee such
convergence and may even end with a solution that is not
a local optimum. Consider a case where the N_p - 1 fittest
chromosomes have identical fitness and the remaining one
has a lower fitness value. The fittest ones are passed
to the next generation while the remaining one will mutate
with very high probability. If the mutation happens
to produce a chromosome with the same fitness as the
others, this will unexpectedly terminate the evolutionary
process.

Moreover, it is helpful to notice that the present fittest
chromosomes, if not already local optima, can always reach
a local optimum through a local search. Consequently,
it is preferable to modify the rule for mutation so as to
allow for local search, which here means performing a random
mutation on a single locus. This mutation operation
is powerful in that it may lead to a node moving between
different components, a component splitting, or two
components merging.
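
A single-locus mutation of this kind might look as follows in Python. This is a minimal sketch under assumptions: the network is given as a hypothetical adjacency dict `adj`, a chromosome is a dict from locus to allele, and whether the mutated chromosome is kept (for instance, only if its fitness does not decrease) is left to the caller, since that rule is not fixed here.

```python
import random


def mutate_one_locus(chrom, adj, rng):
    """Return a copy of chrom in which one random locus gets a new allele.

    The allele at a locus must be a neighbor of the corresponding node, so
    the new allele is drawn from the other neighbors of that node.
    """
    new = dict(chrom)
    # Leaves have a single allele, so mutating them cannot change anything.
    candidates = [v for v in adj if len(adj[v]) > 1]
    locus = rng.choice(candidates)
    new[locus] = rng.choice([a for a in adj[locus] if a != chrom[locus]])
    return new


# Hypothetical toy network: a triangle 0-1-2 with a leaf 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
chrom = {v: adj[v][0] for v in adj}
mutant = mutate_one_locus(chrom, adj, random.Random(0))
changed = [v for v in chrom if chrom[v] != mutant[v]]
print(changed, mutant[changed[0]])
```

Because a chromosome is decoded through its connected components, redirecting one edge in this way is what allows a node to change community, a community to split, or two communities to merge.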

Interestingly, we found that in many cases it may be 
useful for the local search to perform a special split- 
ting operation with a low probability (for example, 0.1). 
The splitting operation on a component drawn randomly 
can be implemented by a bipartitioning in the spectral 
method [U EE H3] ■ Let the number of edges in the com- 
ponent be M c . For the power method, it needs O(N) 
multiplications for the leading eigenvector of a matrix of
size N to converge, which leads to a run time of O(N^2) for a
bipartition in the spectral method [17]. So as not to increase
the time complexity of each generation's evolution, the
multiplication is executed at most N log N / M_c times.

Combining the above considerations, the overall pro- 
cedure of the MAGA for community detection can be 
described as follows. 

(1) The connectivity of the network of interest is fed
into the MAGA. The algorithm then creates N_p initial
feasible solutions, each locus of which is initialized with a
random allele.

(2) At each generation, the MAGA first duplicates the
fittest 10% of the chromosomes of the previous generation
for the current generation.

(3) The MAGA then reproduces 0.9N_p individuals by
selecting from the previous generation in proportion to
their fitness, to prepare for mutation.

(4) The fitness and the fitness cumulative probability 
for chromosomes are evaluated using Eqs. ([5} and ([5]), 
respectively; immediately afterward, the bias for each locus
is evaluated using Eqs. (13) and (14), and the loci are then
ranked according to their biases.

(5) The individuals reproduced in step 3 are swept,
and chromosome i is selected with the same probability
1 - C(f(i)) as in the MOGA; if the chromosome is
chosen, the aforementioned mutation is performed;
otherwise, a local search for the fitter individuals is
performed.

(6) Steps 2-5 are repeated until a certain termination
criterion has been met. The MAGA then outputs
the best partition with the highest fitness.
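
Steps 2 and 3 of the procedure above can be sketched as follows. The fragment is only an illustration: the function name is hypothetical, chromosomes are represented by placeholder integers, and the fitness values are assumed to be non-negative so that they can serve directly as selection weights (a shifted modularity, for example).

```python
import random


def elites_and_parents(population, fitness, rng, elite_frac=0.1):
    """Step 2: copy the fittest 10%; step 3: fitness-proportional selection."""
    n_p = len(population)
    n_elite = max(1, round(elite_frac * n_p))
    order = sorted(range(n_p), key=lambda i: fitness[i], reverse=True)
    elites = [population[i] for i in order[:n_elite]]
    # Roulette-wheel selection of the remaining 0.9 N_p individuals.
    parents = rng.choices(population, weights=fitness, k=n_p - n_elite)
    return elites, parents


population = list(range(50))                  # placeholder chromosomes
fitness = [0.01 * (i + 1) for i in population]
elites, parents = elites_and_parents(population, fitness, random.Random(1))
print(elites, len(parents))
```

The elites are passed on unchanged, while the selected parents are the individuals that step 5 sweeps through for mutation or local search.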

Since in step 3 the fitter individuals tend to be reproduced
because of their higher fitness, step 4 enables the
reproduced fittest individuals always to perform a local
search. Step 2 maintains the elite preservation strategy
in case the strategy is destroyed in step 5. In
this way, the MAGA can not only converge to the global
optimum but also speed up the process.

The most time-consuming operations in each generation
are evaluating the bias and fitness, which takes O(M) time,
and ranking the loci, which takes O(N log N) time. The
ranking operation seemingly has slightly higher complexity
than an O(M) operation if the network is sparse. In
fact, it can be performed faster than the operations
with O(M) time since the latter need to be repeated N_p
times. Therefore, the overall time cost for each generation
of the MAGA is O(M), like that of SGAs.

IV. RESULTS 

In this section, we empirically study the effective- 
ness of the MAGA by applying it to model bipartite 
networks and several real bipartite networks. In both 
cases, we show that MAGA is superior to SGAs and the 
MOGA [5!|, and it also can compete with the nice BRIM 
(bipartite, recursively induced modules) [26] algorithm,
which is dedicated to bipartite networks. We also tested the
performance on several real unipartite networks, com- 
paring with several well-known methods for unipartite 
networks in the literature. 



A. Model bipartite networks 

To test how well our algorithm performs, we have applied
it to model bipartite networks with a known community
structure. A model network can be constructed
in two steps. The first step is to determine the layout
of nodes in the network, i.e., to specify the number of
communities N_c and the numbers of nodes of the two types
included in each community, N_A and N_B, as well as to
assign group membership to these nodes. Next, the
distribution of edges is determined by specifying the
intracommunity and intercommunity link probabilities p_in and
p_out, such that p_in > p_out.

For simplicity, all communities assume the same values
of N_A and N_B. We set N_c = 5, N_A = 12, and N_B = 8,
as used in [26]. One might expect that when p_in is markedly
greater than p_out, the networks generated by the model
have significant community structure that tends to be
detected. Conversely, as p_out approaches p_in, the
generated networks become more uniform and their modular
structure becomes more obscure. In this experiment, p_in
is fixed at 0.9 while p_out is varied by tuning
p_out/p_in from 0.1 to 0.9 in steps of 0.1. On such models
we have tested the performance of the MAGA as well as
of the SGA and the MOGA, with ten example networks for
each setting. On each example we ran these algorithms ten
times.
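
The two-step construction can be illustrated with the following Python sketch. It is a hedged example rather than the generator used for the experiments: the function name and the node labeling (type, community, index) are hypothetical, and each possible edge between a node of type A and a node of type B is assumed to be drawn independently.

```python
import random


def model_bipartite_network(n_c, n_a, n_b, p_in, p_out, rng):
    """Generate a bipartite model network with planted communities."""
    # Step 1: lay out the nodes and fix their community membership.
    a_nodes = [("A", c, i) for c in range(n_c) for i in range(n_a)]
    b_nodes = [("B", c, i) for c in range(n_c) for i in range(n_b)]
    # Step 2: place edges with probability p_in inside a community
    # and p_out between different communities.
    edges = []
    for a in a_nodes:
        for b in b_nodes:
            p = p_in if a[1] == b[1] else p_out
            if rng.random() < p:
                edges.append((a, b))
    return a_nodes, b_nodes, edges


# Setting used in the text: N_c = 5, N_A = 12, N_B = 8, p_in = 0.9,
# here with p_out/p_in = 0.1.
a_nodes, b_nodes, edges = model_bipartite_network(
    5, 12, 8, 0.9, 0.09, random.Random(42)
)
print(len(a_nodes), len(b_nodes), len(edges))
```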

For evaluating the quality of solutions, both the modularity
and the normalized mutual information (NMI) [27]
are useful. But the NMI is more suitable for the current
case since the optimal (correct) division of the model
network is known in advance. This measure takes its maximum
value of 1 when the found division perfectly matches
the known division, while it takes 0, the minimum
value, when they are totally independent of each other.
Accordingly, we employed the stop criterion that the
algorithms reach the predefined generation size (maximum
number of generations) or the NMI reaches its maximum
value.
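
For concreteness, the NMI between two divisions can be computed as in the sketch below. It assumes the commonly used normalization 2I(A;B)/[H(A) + H(B)], which has the stated extreme values of 1 and 0; the function name and the label-list representation of a division are illustrative choices.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information between two divisions of the same nodes."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        c / n * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    if h_a + h_b == 0:
        return 1.0  # both divisions put all nodes in a single group
    return 2 * mutual / (h_a + h_b)


same = nmi([0, 0, 1, 1], [1, 1, 0, 0])         # same division, relabeled
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # statistically independent
print(same, independent)
```

The measure depends only on how the nodes are grouped, not on the labels themselves, which is why the relabeled division in the example still scores 1.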

Figures 4 and 5 display the performance comparison
between these genetic algorithms for p_out/p_in = 0.1 and
p_out/p_in = 0.2, respectively. The generation size is set to
2000. In both cases, the MAGA and the SGA remarkably
outperform the MOGA. From Fig. 4(a), we can see
that the MAGA is appreciably faster than the SGA,
although both perform well since the mutual information
rapidly exceeds 0.9. In our test, every run of the MAGA on
all ten example networks gave the optimal division, i.e.,
each of the 100 runs needed fewer than 2000 generations.
For the SGA, 97 runs gave the optimum division.
The distributions of the number of generations
needed to reach the optimum, reported in Figs. 4(b) and
4(c), further reveal their difference in speed (in terms of
the number of generations).

When p_out/p_in = 0.2, it is more difficult to identify
the community structure of the example networks than in
the previous case. The SGA succeeded in obtaining
the optimum division in 32 runs. In sharp contrast,
each run of the MAGA gave the optimum division. More
information on the distributions of the number of
generations is provided in Fig. 5(a). Also, in Fig. 5(b),
the variations of the mutual information for
the SGA and the MAGA show that there exists a

FIG. 4: (Color online) Performance on bipartite model
networks with p_in = 0.9 and p_out/p_in = 0.1. The generation size
is set to 2000. (a) Variation of normalized mutual information
over the first 500 generations. (b) Distribution of the number
of generations needed to reach the optimum using the SGA.
In more than half of the runs, the number of generations
exceeds 200. (c) Distribution of the number of generations
needed to reach the optimum using the MAGA. There are 83
runs in which the number of generations is less than 200.



greater performance difference between them than in the
case of p_out/p_in = 0.1.

Even for p_out/p_in = 0.1, the MOGA was not observed
to reach the optimum solution in its first 2000 generations.
Actually, the MOGA performed so
poorly that it was even much slower than the SGA, as
shown in Figs. 4(a) and 5(b). We argue that the main

FIG. 5: (Color online) Performance on model networks with
p_in = 0.9 and p_out/p_in = 0.2. The generation size is set to
2000. (a) Distributions of the number of generations to reach
the optimum using the SGA and MAGA. There are 32 black
circle points and 100 red box points, respectively representing
the number of generations needed to reach the optimum using
the SGA and MAGA. Most numbers of generations for the
SGA are distributed above 1000, while for the MAGA most are
below 800. (b) Variation of normalized mutual information
with the number of generations. Each point is the average
over the 100 runs.



reason for this is that the use of an inappropriate measure
of the informativeness of the loci has misguided the algorithm.



We have made a more extensive performance comparison.
Figure 6 shows the variations of accuracy of the
MAGA and SGA as well as BRIM against changes of
p_out/p_in. For the model networks, assigning each of the
nodes from the smaller groups to its own module is a better
strategy for BRIM that will lead to a precise division.
To be fair [60], we picked the best division from the ten
runs on each sample network and then averaged over ten
examples for a particular p_out/p_in.

FIG. 6: (Color online) Variation of performance of the
algorithms with different p_out/p_in. Each point is the average over
ten sample networks. For p_out/p_in = 0.1 and 0.2, the
generation size is set to 2000; for other values, the size is set to
3000.



B. Southern women network 

As the first example of a real bipartite network, we
study the southern women network [61]. This social network
consists of 18 women and 14 events, for which the
data were collected by Davis et al. in the 1930s, describing
the participation of the women in these events. It
has been extensively used as a typical instance for
investigating the problem of finding cohesive groups hidden in
social networks; see Ref. [25] for a useful review.

We have run the MAGA ten times on this network,
with a population size of 100 and a generation
size of 3000. Unlike the BRIM algorithm, for which the
initial state is important, initial states are generally
irrelevant (or only weakly relevant) for genetic algorithms, so
they can still succeed in finding a quite good solution. In
each run, the MAGA found the best solution known so far, with
Q = 0.3455.

Figure 7 shows the community structure identified in
the southern women network using the MAGA. This
division is exactly the same as that found with BRIM with
the initial strategy that begins with assigning all events
to a single community. We have also applied the SGA
and the MOGA to this network with the same population
size and generation size. A simple performance
comparison between them is shown in Table III, which
lists the number of successes in reaching the best solution,
the minimum (MinGen) and maximum (MaxGen) numbers of
generations needed to reach the optimum, the average
normalized mutual information (I_norm), and the average
modularity (Q*).

Whether we are concerned with speed or
quality, the MAGA again has an evident advantage

TABLE IV: Performance comparison on the southern women
network, where some data are drawn from [26].

Method    Communities   Q*       I_norm
MAGA      4             0.3455   1
BRIM 1    4             0.3455   1
BRIM 2    2             0.3212   0.5803
Davis 1   2             0.3106   0.4466
Davis 2   2             0.3184   0.4513
Doreian   3             0.2939   0.6077



FIG. 7: (Color online) Southern women network (dashed lines
indicate the division found by the MAGA). Each community
consists of those nodes with the same color (gray-scale level),
with women and events represented by boxes and triangles,
respectively.



over the SGA and MOGA. Table IV shows the accuracy
of the MAGA in comparison with other methods.

TABLE III: Performance comparison between the SGA,
MOGA and MAGA on the southern women network. Each
algorithm was run ten times. Here, success means that the
algorithm found the best solution before it reached the
generation size 3000.

Method   Succ.   MinGen.   MaxGen.   I_norm   Q*
SGA      4       2011      2924      0.8923   0.3454
MOGA     0       -         -         0.7997   0.3448
MAGA     10      87        1830      1        0.3455



Most previous studies assigned these women to groups
depending on their interests. Davis et al. [61] assigned
the women to two groups, labeled 1-9 and 9-18. Woman
9 can be considered as an overlapping node of the two
groups in a sense, but should be exclusively included in
one group by the currently used community definition.
We label the division with 9 and 1-8 in the same
group as "Davis 1", and the alternative division (9 is
grouped with 10-18) as "Davis 2".

Doreian et al. [62] adopted the definition of a bipartite
community composed of two types of nodes and proposed
several divisions, with the accuracy of the division with
the highest modularity shown in Table IV. We call the
BRIM algorithm using the strategy of (1) assigning all
events to a single module and (2) assigning each event to
its own module "BRIM 1" and "BRIM 2," respectively.
Barber [26] reported its accuracy when using these
strategies on the network; these results can also be found in
Table IV.



FIG. 8: (Color online) Distributions of the divisions returned
by the SGA, MOGA and MAGA. The black horizontal line
indicates the best bipartite modularity reported in [26] using
the BRIM algorithm.



C. Scotland corporate interlock network 

The second real-world bipartite network we have used
as a test case is the Scotland corporate interlock
network [63]. This network describes the corporate interlock
pattern between 136 directors and the 108 largest joint
stock companies during 1904-1905. As it is disconnected,
we focus only on its largest component, which
comprises 131 directors and 86 firms. In the following, the
word "network" consistently indicates this component.

The BRIM algorithm found relatively poor divisions of this
network, with Q = 0.5663 and Q = 0.3987, using the
strategies of assigning all directors to unique modules or
to the same module. With the adaptive binary search
technique, the BRIM algorithm, when using the strategy
of randomly assigning directors to modules, may find a
much better solution with Q = 0.663(±0.002). Based
on the experimental results, the author of [26] suggested
that the network comprises approximately 20 communities.

Similarly, we have examined the performance of the three
algorithms on this network by running each ten times with the
same settings as before. Figure 8 shows the distributions
of the solutions returned by the SGA, MOGA, and MAGA.
Both the SGA and MAGA clearly exhibit
higher accuracy than BRIM and the MOGA.

Moreover, the MAGA appears preferable to the SGA.
In the experiment, the modularity of the best division
found by the SGA, Q = 0.7070, is less than those of the
best two divisions (π_1 and π_2) found by the MAGA, with
Q = 0.7093 and Q = 0.7089. On the other hand, as
shown in Fig. 8, for the MAGA most of the ten divisions,
including the best two, are found during the first 2000
generations, while for the SGA six of the ten divisions are
found after 2000 generations.

In closing, we would like to give a simple evaluation of
the reliability of the solutions. We calculated the normalized
mutual information between all pairs of solutions
returned by the MAGA. The maximum value of the NMI
is between π_1 and π_2 and is equal to 0.9191, indicating
that they are very similar. In addition, for each solution,
we calculated the average of the NMI between that
division and the other divisions. We found that π_2 has the
largest value, 0.8459, and π_1 has the third largest value,
0.8248. These facts lend confidence in the reliability of
the optimum divisions obtained, π_1 and π_2. Figure 9
shows the community structure of this network according
to π_2. Clearly, the MAGA has indeed given a very
accurate division of this network.



D. Unipartite networks 



a good tradeoff between speed and accuracy. As shown in
Table V, the MAGA almost consistently outperforms the
EO and SR methods for these networks. Interestingly, for
the Zachary network the MAGA found the accurate so- 
lution with Q = 0.4198 [UlzH, while neither the EO nor 
SR method can find it, in spite of the fact that the network
is very simple. Furthermore, for the larger networks
the gap in performance tends to widen; for example, the
maximum modularity difference approaches 18% (11%)
relative to the EO (SR) method for the largest network
studied.

Even when compared with the SA method, which is widely
considered to be the most accurate modularity maximization
method, the MAGA may give a higher modularity
while significantly reducing the time cost. In fact, the SA
method theoretically allows finding the global optima of
modularity, but its exponential complexity restricts it
to finding a better local optimum and to resolving
networks of scale only up to 10^4. The performance of
the SA listed in Table V was reported for runs on an
Intel PC with two 3.2 GHz processors in [H, wherein the 
authors proposed an accurate method that can be com- 
petitive with the SA method but has very high memory 
demand. We ran the MAGA ten times for all the networks,
with a predefined generation size, on an Intel PC
with two 2.93 GHz processors. The last two columns of
Table V show the number of generations and the running
time needed to find the maximum modularity in the
runs.



The MAGA can also be applied to unipartite networks
by optimizing the bipartite modularity after the transformation
mentioned in Sec. II. Being a kind of genetic
algorithm, however, the MAGA can directly optimize the
unipartite modularity as the SGA does (4|| , which distin- 
guishes it from certain methods such as the SR method 
which is required to develop different versions for differ- 
ent classes of networks [13, 13, HE HI] ■ Furthermore, the 
modularity consistency revealed in Sec. II means that
the MAGA can also optimize unipartite modularity more
effectively, so we focus only on the comparison
with those well-known methods, including the Girvan-
Newman (GN) algorithm Q , EO [13 , SR [13, d , and 

sa m. 

To test the performance of the MAGA on unipartite
networks, we have considered several real networks with
different scales: the Zachary karate club network [64],
the jazz musicians network [65] (Jazz), the Caenorhabditis
elegans metabolic network [66] (C. elegans), the email
network of University Rovira i Virgili [67] (Email), a trust
network of users of the Pretty-Good-Privacy (PGP)
algorithm for information security [68], and a coauthorship
network of scientists working in condensed matter
physics [69] (Cond-mat).

The EO and SR methods clearly outperform the original
method for detecting communities (the GN method);
they may both be viewed as representatives of modularity
maximization approaches in that they can achieve


V. CONCLUSION 

We have shown both that unipartite and directed net- 
works can be equivalcntly represented as bipartite net- 
works, and their modularity is just the corresponding 
bipartite modularity. This implies that bipartite net- 
works can be considered as an extensive class of net- 
works including unipartite and directed networks, and 
that detecting communities in bipartite networks pro- 
vides a uniform framework for solving the problem in 
various networks. Therefore, methods for detecting com- 
munity structure of bipartite networks generally can be 
applied to unipartite and directed networks. 

We have presented an adaptive genetic algorithm, the
MAGA, for the task of community structure detection.
This algorithm is based on the MOGA, which was presented
with the aim of improving the performance of traditional
genetic algorithms. But we have shown that the
MOGA performs poorly when applied to this task.
In fact, we have revealed that the MOGA is misguided
by the allele standard deviation and does not guarantee
convergence to the global optima. In the MAGA,
we introduced a different measure for the informativeness
of loci, a modified rule for mutation, and a reassignment
technique. These ingredients jointly enable the
MAGA to optimize the objective function for
community structure detection more effectively. The experiments on

TABLE V: Performance comparison of the MAGA, Girvan-Newman (GN), extremal optimization (EO), spectral relaxation
(SR), and simulated annealing (SA) methods in terms of modularity and running time (only for the SA and MAGA)
for unipartite networks. The modularity in bold font represents the maximum modularity obtained for the network, with the
corresponding number of generations and time shown in the last two columns. The running time for the SA or MAGA is
measured in minutes (min) or seconds (s).

                                           SA(0.999)          MAGA
Network    Size    GN      EO      SR      Q       Time       GenSize   Q       Generations   Time
Zachary    34      0.401   0.419   0.419   0.420   12s        100       0.420   5             0.1s
Jazz       198     0.405   0.445   0.442   0.445   58min      8000      0.445   7222          19min
C.elegans  453     0.403   0.434   0.435   0.450   146min     8000      0.452   3487          12min
E-mail     1133    0.532   0.574   0.572   0.579   1143min    10000     0.581   9280          72min
PGP        10680   0.816   0.846   0.855   -       -          20000     0.881   19867         610min
Cond-mat   27519   -       0.679   0.723   -       -          30000     0.802   29995         3517min


bipartite (model and real) networks have consistently 
shown that the MAGA outperforms the MOGA, SGA, 
and BRIM. Compared to BRIM, another advantage is 
that the MAGA can automatically determine the number 
of communities. The results on unipartite networks indi- 
cate that the global optimization method is indeed more 
accurate than the EO and SR methods as expected, and 
that it also can attain or even outperform the accuracy 
of the SA method in a significantly shorter time, which 
is crucial for analyzing large networks. 

The time complexity of each generation's evolution in
the MAGA is O(M), and the overall time demand of
this algorithm depends on the population size and the
generation size [70]. Although the MAGA can theoretically
find the global optima of an objective function,
the quality of solutions delivered by the MAGA rests in
practice on the generation size, given the population size.
Owing to the low complexity of each generation's evolution,
we can run enough generations to get a high-quality
solution. Empirical results showed that the MAGA can
effectively resolve the community structure of networks
at many scales up to 10^5, which covers many kinds
of real networks such as social, metabolic, and technological
networks. Beyond these scales there are several nice
local methods available [71-74], while the performance
of our algorithm on networks with such scales needs to
be further explored. On the other hand, since a parallel
implementation of the MAGA allows each of the
most time-consuming operations on the N_p chromosomes to
be calculated simultaneously by assigning them to the
multiple processors of a highly efficient computer, it seems that
even for networks of millions of nodes the MAGA is still
a promising method for accurate detection of their
community structure.

Methodologically, the MAGA for community detection
is based on the idea of optimization, so its accuracy
is determined by the selection of an objective function.
Here, we use the (bipartite) modularity as the objective to
optimize, which certainly may suffer from the resolution
problem, although this may not be severe for many real
networks. On the one hand, the resolution problem
essentially is favorable for gaining deeper insight into the
structure of networks [36]. On the other hand, the effect
of this problem may be circumvented or alleviated
as needed. For example, the MAGA can perform network
preprocessing with a random walk [75] before optimizing or
take an alternative objective function [76] instead of the
modularity. Also, we can combine several high-quality
solutions to obtain a more accurate division of the network
of interest [H, [72[ • 

Overall, the MAGA enables us to accurately and effectively
detect community structure for various networks
including bipartite, unipartite, directed, and weighted
networks, so long as it takes the corresponding modularity
as the fitness function. The evolutionary method
can return multiple high-quality solutions with no bias,
which may provide some useful information on the reliability
of the solutions of interest and may be combined in
a way to obtain a better solution. Finally, we believe that
as an effective discrete optimization method (the special
reassignment technique can be switched off as needed) it
will find more applications in many fields.